In [ ]:
from codefiles.datagen import x_plus_noise
import numpy as np
import seaborn as sns
%matplotlib inline

Correlation Matrix

Calling df.corr() on a pandas DataFrame returns a square matrix containing the correlation between every pair of columns.

Plotting that matrix as a heatmap lets you inspect many correlations at a glance.
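To make the shape of that matrix concrete, here is a minimal sketch on synthetic data. The DataFrame `demo` and its column names are hypothetical stand-ins built with NumPy; they are not produced by the notebook's `x_plus_noise` helper, whose output may be structured differently.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data, NOT the notebook's x_plus_noise helper.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
demo = pd.DataFrame({
    "x": x,
    "x_copy": x,                          # identical to x
    "x_noisy": x + rng.normal(size=500),  # x plus independent noise
    "unrelated": rng.normal(size=500),    # drawn independently of x
})

# DataFrame.corr() uses the Pearson coefficient by default.
corr = demo.corr()
print(corr.round(2))
```

The result has one row and one column per feature, is symmetric, and has ones on the diagonal because every column is perfectly correlated with itself. Passing `corr` to `sns.heatmap` draws it as a color grid; since Pearson correlations range from -1 to 1, `vmin=-1, vmax=1` is probably the safer color scale whenever negative correlations are possible.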

Correlation matrix with two perfectly correlated features


In [ ]:
df = x_plus_noise(randomness=0)
sns.heatmap(df.corr(), vmin=0, vmax=1)
df.corr()

Correlation matrix with mildly correlated features


In [ ]:
df = x_plus_noise(randomness=0.5)
sns.heatmap(df.corr(), vmin=0, vmax=1)
df.corr()

Correlation matrix with weakly correlated features


In [ ]:
df = x_plus_noise(randomness=1)
sns.heatmap(df.corr(), vmin=0, vmax=1)
df.corr()